1.
Genomics & Informatics ; : e40-2019.
Article in English | WPRIM | ID: wpr-830121

ABSTRACT

Studies that detect and analyze indels or single nucleotide polymorphisms in human genomic sequences are actively conducted, but studies that detect long insertions/deletions remain difficult to carry out. Over the last 10 years, long-read data for human genomes from the PacBio and Nanopore platforms have become more available, which makes long insertions/deletions easier to detect. However, because long-read sequencing is relatively expensive, most next-generation sequencing data are still produced by short-read sequencing machines. Here, we constructed programs to detect so-called unmapped regions (UMRs, regions of the reference genome to which no reads are mapped), scanned 40 Korean genomes to select UMRs as long-deletion candidates, and compared the candidates with the long-deletion break points in the genomes available from the 1000 Genomes Project (1KGP). An average of about 36,000 UMRs was found in the 40 Korean genomes tested, 284 UMRs were common to all 40 genomes, and a total of 37,943 UMRs were found. Of these, 30,698 UMRs overlapped with the 74,045 break points provided by the 1KGP. As the number of compared samples increased from 1 to 40, the number of UMRs that overlapped with the break points also increased, eventually reaching 80.9% of the total UMRs found in this study. Because the total number of overlapping UMRs could probably grow to encompass the 74,045 break points as more Korean genomes are included, this approach could be practically useful for studies of long deletions that rely on short-read data.
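The abstract does not describe how the UMR detection programs work, so the following is only a minimal sketch of the general idea, assuming per-base read depth is already available as a list of integers. The function names, the `min_length` parameter, and the toy depth values are hypothetical and are not taken from the paper.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # half-open interval [start, end) on the reference


def find_unmapped_regions(depth: List[int], min_length: int = 1) -> List[Interval]:
    """Return intervals where the per-base read depth is zero."""
    regions: List[Interval] = []
    start = None
    for pos, d in enumerate(depth):
        if d == 0 and start is None:
            start = pos  # a zero-depth run begins here
        elif d != 0 and start is not None:
            if pos - start >= min_length:
                regions.append((start, pos))
            start = None
    # Close a run that extends to the end of the sequence.
    if start is not None and len(depth) - start >= min_length:
        regions.append((start, len(depth)))
    return regions


def overlaps_any(region: Interval, breakpoints: List[int]) -> bool:
    """True if at least one break point position falls inside the region."""
    start, end = region
    return any(start <= bp < end for bp in breakpoints)


# Toy data (hypothetical): depth per reference position and two break points.
depth = [5, 3, 0, 0, 0, 4, 0, 2, 0, 0]
umrs = find_unmapped_regions(depth, min_length=2)
hits = [r for r in umrs if overlaps_any(r, [3, 6])]
print(umrs, hits)
```

A real pipeline would read depth from alignment files and treat break points as intervals with some tolerance; both are left out here to keep the sketch short.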

2.
Genomics & Informatics ; : 40-2019.
Article in English | WPRIM | ID: wpr-785801



Subject(s)
Humans , Genome , Genome, Human , Nanopores , Polymorphism, Single Nucleotide
3.
Journal of Korean Medical Science ; : 817-824, 2017.
Article in English | WPRIM | ID: wpr-156646

ABSTRACT

Necrotizing enterocolitis (NEC), characterized by inflammatory intestinal necrosis, is a major cause of mortality and morbidity in newborns. Deep RNA sequencing (RNA-Seq) has recently emerged as a powerful technology that quantifies gene expression better than microarrays and with a lower background signal. A total of 10 transcriptomes from 5 pairs of NEC lesions and adjacent normal tissues obtained from preterm infants with NEC were analyzed. A total of 65 genes (57 down-regulated and 8 up-regulated) showed significantly different expression levels in the NEC lesion compared with the adjacent normal region, using thresholds of fold change ≥ 1.5 and P ≤ 0.05. The most significant gene, DPF3 (P < 0.001), has recently been reported to be differentially expressed across colon segments. Our gene ontology analysis of NEC lesions versus adjacent normal tissues showed that the down-regulated genes were most significantly enriched in nervous system development (P = 9.3 × 10⁻⁷; P(corr) = 0.0003). In further pathway analysis using Pathway Express, based on the Kyoto Encyclopedia of Genes and Genomes (KEGG) database, genes involved in thyroid cancer and axon guidance were predicted to be associated with differential expression (P(corr) = 0.008 and 0.020, respectively). Although replication in a larger sample and functional evaluation are needed, our results suggest that the altered gene expression, together with the functional pathways and categories in which these genes are involved, may provide insight into NEC development and aid future research.
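The selection rule stated in the abstract (fold change ≥ 1.5 and P ≤ 0.05) can be illustrated with a small filter. This is a sketch under assumptions: the gene names and expression values are invented, the fold change is taken as a simple ratio of positive mean expression values, and the abstract does not say how the authors computed fold change or P values.

```python
from typing import Dict, List, NamedTuple


class GeneResult(NamedTuple):
    gene: str
    lesion_mean: float   # mean normalized expression in lesion tissue (must be > 0)
    normal_mean: float   # mean normalized expression in adjacent normal tissue (must be > 0)
    p_value: float


def signed_fold_change(lesion_mean: float, normal_mean: float) -> float:
    """Symmetric fold change: 2.0 means doubled, -2.0 means halved."""
    ratio = lesion_mean / normal_mean
    return ratio if ratio >= 1.0 else -1.0 / ratio


def select_degs(results: List[GeneResult],
                min_fold: float = 1.5,
                max_p: float = 0.05) -> Dict[str, List[str]]:
    """Split genes passing both thresholds into up- and down-regulated lists."""
    selected: Dict[str, List[str]] = {"up": [], "down": []}
    for r in results:
        if r.p_value > max_p:
            continue  # not significant
        fc = signed_fold_change(r.lesion_mean, r.normal_mean)
        if fc >= min_fold:
            selected["up"].append(r.gene)
        elif fc <= -min_fold:
            selected["down"].append(r.gene)
    return selected


# Hypothetical values for illustration only.
example = [
    GeneResult("GENE_A", 30.0, 10.0, 0.01),   # 3-fold up, significant
    GeneResult("GENE_B", 10.0, 40.0, 0.001),  # 4-fold down, significant
    GeneResult("GENE_C", 12.0, 10.0, 0.001),  # change too small
    GeneResult("GENE_D", 50.0, 10.0, 0.2),    # not significant
]
degs = select_degs(example)
print(degs)
```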


Subject(s)
Humans , Infant, Newborn , Axons , Colon , Enterocolitis, Necrotizing , Gene Expression Profiling , Gene Expression , Gene Ontology , Genome , Infant, Premature , Mortality , Necrosis , Nervous System , Pilot Projects , Sample Size , Sequence Analysis, RNA , Thyroid Neoplasms , Transcriptome
4.
Allergy, Asthma & Immunology Research ; : 142-148, 2014.
Article in English | WPRIM | ID: wpr-19427

ABSTRACT

PURPOSE: Endoplasmic reticulum (ER) stress has recently been observed to activate NF-kappaB and induce inflammatory responses such as asthma. Activating transcription factor 6beta (ATF6B) is known to regulate the ATFalpha-mediated ER stress response. This study aimed to investigate the associations of ATF6B genetic variants with aspirin-exacerbated respiratory disease (AERD) and its major phenotype, the percent decline in FEV1 after aspirin provocation. METHODS: Four common single nucleotide polymorphisms (SNPs) of ATF6B were genotyped and statistically analyzed in 93 AERD patients and 96 aspirin-tolerant asthma (ATA) patients as controls. RESULTS: Logistic analysis revealed that 2 SNPs (rs2228628 and rs8111, P=0.008; corrected P=0.03) and 1 haplotype (ATF6B-ht4, P=0.005; corrected P=0.02) were significantly associated with the percent decline in FEV1 after aspirin provocation, whereas ATF6B polymorphisms and haplotypes were not associated with the risk of AERD. CONCLUSIONS: Although further functional and replication studies are needed, our preliminary findings suggest that ATF6B may be related to obstructive phenotypes in response to aspirin exposure in adult asthmatics.


Subject(s)
Adult , Humans , Aspirin , Asthma , Endoplasmic Reticulum , Haplotypes , Methods , NF-kappa B , Phenotype , Polymorphism, Single Nucleotide , Transcription Factors
5.
Genomics & Informatics ; : 88-98, 2012.
Article in English | WPRIM | ID: wpr-141259

ABSTRACT

Lipoprotein lipase (LPL) plays an essential role in the regulation of high-density lipoprotein cholesterol (HDLC) and triglyceride levels, which have been closely associated with cardiovascular diseases. Genetic studies in European populations have shown that LPL single-nucleotide polymorphisms (SNPs) are strongly associated with lipid levels. However, the influence of interactions between LPL SNPs and lifestyle factors has not been sufficiently studied. Here, we examine whether LPL polymorphisms, as well as their interactions with lifestyle factors, influence lipid concentrations in a Korean population. A two-stage association study was performed using genotype data for SNPs in the LPL gene, including the 3' flanking region, from 7,536 (stage 1) and 3,703 (stage 2) individuals. The association study showed that 15 SNPs and 4 haplotypes were strongly associated with HDLC (lowest p = 2.86 × 10⁻²²) and triglyceride levels (lowest p = 3.0 × 10⁻¹⁵). Interactions between LPL polymorphisms and lifestyle factors (lowest p = 9.6 × 10⁻⁴) were also observed for lipid concentrations. These findings suggest that lipid concentrations in a Korean population are influenced both by LPL polymorphisms themselves and by their interactions with lifestyle variables, including energy intake, fat intake, smoking, and alcohol consumption.


Subject(s)
3' Flanking Region , Alcohol Drinking , Cardiovascular Diseases , Cholesterol , Cross-Sectional Studies , Energy Intake , Genotype , Haplotypes , Life Style , Lipoprotein Lipase , Lipoproteins , Polymorphism, Single Nucleotide , Smoke , Smoking
6.
Genomics & Informatics ; : 88-98, 2012.
Article in English | WPRIM | ID: wpr-141258



Subject(s)
3' Flanking Region , Alcohol Drinking , Cardiovascular Diseases , Cholesterol , Cross-Sectional Studies , Energy Intake , Genotype , Haplotypes , Life Style , Lipoprotein Lipase , Lipoproteins , Polymorphism, Single Nucleotide , Smoke , Smoking
7.
Genomics & Informatics ; : 148-151, 2009.
Article in English | WPRIM | ID: wpr-10792

ABSTRACT

Generally, a larger sample size leads to greater statistical power to detect a significant difference. We may increase the sample size for both cases and controls in order to obtain greater power. However, increasing the number of cases is often not feasible for a variety of reasons. To examine how power changes as the ratio of controls to cases varies (1:1 to 4:1), we conducted association tests with simulated data generated by PLINK. The simulated data consist of 50 disease SNPs and 300 non-disease SNPs, and we computed power for the disease SNPs. Genetic Power Calculator was used to compute power while varying the ratio of controls to cases (1:1, 2:1, 3:1, 4:1). In this study, we show that the gains in statistical power from increasing the ratio of controls to cases are substantial for the simulated data. Similar results might be expected for real data.
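The effect of the control-to-case ratio on power can be illustrated with a textbook normal approximation for comparing two proportions. This sketch is a stand-in, not the PLINK simulation or the Genetic Power Calculator used in the study; the frequencies, the number of cases, and the function name are assumptions chosen for illustration.

```python
from math import sqrt
from statistics import NormalDist


def approx_power(p_case: float, p_control: float, n_cases: int,
                 control_ratio: float, alpha: float = 0.05) -> float:
    """Approximate power of a two-sided two-proportion z-test.

    Uses the unpooled standard error and ignores the negligible
    contribution of the opposite rejection tail.
    """
    n_controls = n_cases * control_ratio
    se = sqrt(p_case * (1 - p_case) / n_cases
              + p_control * (1 - p_control) / n_controls)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    return NormalDist().cdf(abs(p_case - p_control) / se - z_crit)


# Hypothetical setting: 500 cases, frequency 0.30 in cases vs 0.25 in controls.
powers = {ratio: approx_power(0.30, 0.25, 500, ratio) for ratio in (1, 2, 3, 4)}
for ratio, power in powers.items():
    print(f"{ratio}:1 controls to cases -> approximate power {power:.3f}")
```

With the number of cases fixed, adding controls shrinks only the control term of the standard error, so each additional multiple of controls yields a smaller gain than the previous one.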


Subject(s)
Case-Control Studies , Polymorphism, Single Nucleotide , Sample Size
8.
Genomics & Informatics ; : 149-153, 2005.
Article in English | WPRIM | ID: wpr-191505

ABSTRACT

Asthma is an inflammatory airway disease characterized by bronchial hyperresponsiveness and airway obstruction, which results from a complex interaction of genetic and environmental factors. Interleukin (IL)-13 and IL-4 are important in IgE synthesis and allergic inflammation; therefore, the genes encoding IL-13 and IL-4 are candidates for predisposition to asthma. In the present study, we screened single-nucleotide polymorphisms (SNPs) in IL-13 and IL-4 and examined whether they are risk factors for asthma. We resequenced all exons and the promoter region in 12 asthma patients and 12 normal controls, and identified 18 SNPs, including 2 novel SNPs. The linkage disequilibrium (LD) pattern was evaluated with 16 common SNPs, and haplotypes were also estimated within each block. Although IL-13 and IL-4 are localized within 27 kb on chromosome 5q31 and share many biological profiles, this region was partitioned into 2 blocks. One SNP and three SNPs were determined to be haplotype-tagging SNPs (htSNPs) within the IL-13 and IL-4 haplotype blocks, respectively. No significant associations were observed between any of the SNPs or haplotypes and the development of asthma in this small number of Korean subjects. However, the genetic variants of IL-13 and IL-4 should provide a useful basis for genotyping studies in larger populations.
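The abstract does not say which LD measure was used to define the blocks. As a hedged illustration, the two standard pairwise measures, |D'| and r², can be computed from haplotype and allele frequencies as follows; the function name and the example frequencies are hypothetical.

```python
from typing import Tuple


def ld_stats(p_ab: float, p_a: float, p_b: float) -> Tuple[float, float]:
    """Return (|D'|, r^2) for two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A and allele B
    p_a, p_b: frequencies of allele A and allele B (each strictly between 0 and 1)
    """
    d = p_ab - p_a * p_b  # raw disequilibrium coefficient D
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    r2 = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d_prime, r2


# Hypothetical frequencies: complete LD, then independent loci.
complete = ld_stats(p_ab=0.3, p_a=0.3, p_b=0.3)
independent = ld_stats(p_ab=0.15, p_a=0.5, p_b=0.3)
print(complete, independent)
```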


Subject(s)
Humans , Asthma , Exons , Haplotypes , Immunoglobulin E , Inflammation , Interleukin-13 , Interleukin-4 , Interleukins , Linkage Disequilibrium , Polymorphism, Single Nucleotide , Promoter Regions, Genetic , Risk Factors
9.
Genomics & Informatics ; : 180-183, 2004.
Article in English | WPRIM | ID: wpr-13645

ABSTRACT

Comparative Statistic Module (CSM) provides genomics researchers with a more reliable list of significant genes by reporting the genes commonly selected by five statistical methods: the t-test, the Kruskal-Wallis (Wilcoxon signed rank) test, SAM, the two sample multiple test, and the Empirical Bayesian test. It also suggests a method of choice by ranking each statistical test according to the average rank it assigns to the common genes. This statistical analysis module is implemented in Perl and R.
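The abstract gives no algorithmic detail, so the following is one possible reading of the idea, not CSM itself (which is written in Perl and R): take the genes that appear in the top-k list of every method, then order the methods by the average rank each assigns to those common genes. The method names, gene names, and `top_k` parameter are hypothetical.

```python
from typing import Dict, List, Set, Tuple


def common_genes(rankings: Dict[str, List[str]], top_k: int) -> Set[str]:
    """Genes that appear in the top_k of every method's ranked list."""
    top_sets = [set(genes[:top_k]) for genes in rankings.values()]
    return set.intersection(*top_sets)


def rank_methods(rankings: Dict[str, List[str]], top_k: int) -> List[Tuple[str, float]]:
    """Order methods by the average 1-based rank they give to the common genes."""
    shared = common_genes(rankings, top_k)
    if not shared:
        return []
    averages = []
    for method, genes in rankings.items():
        ranks = [genes.index(g) + 1 for g in shared]
        averages.append((method, sum(ranks) / len(ranks)))
    return sorted(averages, key=lambda item: item[1])


# Hypothetical ranked gene lists from three methods (most significant first).
rankings = {
    "t_test": ["g1", "g2", "g3", "g4", "g5"],
    "sam":    ["g2", "g1", "g4", "g3", "g5"],
    "bayes":  ["g3", "g5", "g1", "g2", "g4"],
}
shared = common_genes(rankings, top_k=4)
ordered = rank_methods(rankings, top_k=4)
print(sorted(shared), ordered)
```

Under this reading, the method that places the common genes nearest the top of its own list comes first.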


Subject(s)
Genomics
10.
Genomics & Informatics ; : 131-133, 2004.
Article in English | WPRIM | ID: wpr-105280

ABSTRACT

MediScore is an information retrieval system that helps to search for the set of genes associated with a specific disease or the set of diseases associated with a specific gene. Despite recent improvements in natural language processing (NLP) and other text mining approaches for finding disease-associated genes, many false positive results arise from the diversity of exceptional cases and from ambiguities in gene names. To overcome these weaknesses of current text mining approaches, MediScore introduces two features. The first is statistical normalization based on the normal approximation to the binomial distribution, which corrects inaccurate scores caused by common words that do not represent genes. The second is interactive rescoring by the user to remove false positive results. Interactive rescoring includes individual alias scoring for each gene to remove false gene synonyms, reference to MEDLINE abstracts, and cross-referencing between OMIM and other related information.
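The abstract names the normal approximation to the binomial but not the exact scoring formula, so the sketch below shows only the generic form of such a normalization: a term that occurs in `hits` of `n_docs` disease-related abstracts is compared with what its background rate would predict. All names and numbers are hypothetical.

```python
from math import sqrt


def normalized_score(hits: int, n_docs: int, background_rate: float) -> float:
    """z-score of an observed count under a Binomial(n_docs, background_rate) model.

    background_rate must lie strictly between 0 and 1.
    """
    expected = n_docs * background_rate
    std_dev = sqrt(n_docs * background_rate * (1 - background_rate))
    return (hits - expected) / std_dev


# Hypothetical counts: both terms occur in 25 of 100 disease-related abstracts.
specific_gene = normalized_score(hits=25, n_docs=100, background_rate=0.02)
common_word = normalized_score(hits=25, n_docs=100, background_rate=0.25)
print(f"specific gene symbol: {specific_gene:.2f}, common word: {common_word:.2f}")
```

A gene alias that is also an everyday word has a high background rate, so the same raw count gives a score near zero, which is the kind of correction the abstract describes.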


Subject(s)
Data Mining , Databases, Genetic , Information Systems , Natural Language Processing